AITopics | representation network

The LoCA Regret: A Consistent Metric to Evaluate Model-Based Behavior in Reinforcement Learning -- Supplementary Material -- A Tabular Experiments

Neural Information Processing Systems, Oct-2-2025, 20:18:27 GMT

Here, we discuss some additional settings for the tabular experiments. The reason for this is that Sarsa(0.95), in contrast to MB-VI and MB-SU, is a multi-step method. Therefore, there is stochasticity in the update target, even in deterministic environments, due to the exploration of the behavior policy. All methods used optimistic initialization. The pseudocode of the tabular, on-policy method used in Section 5.1 is shown in Algorithm 1. These estimates are updated at the end of the episode, using the data gathered during the episode.
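The multi-step character of Sarsa(0.95) and the optimistic initialization mentioned in this excerpt can be illustrated with a minimal sketch of tabular Sarsa(lambda) with accumulating eligibility traces. This is not Algorithm 1 from the paper: the function name `sarsa_lambda_episode`, the step size, the discount factor, the optimistic value, and the two-step toy episode are all assumptions chosen for illustration.

```python
from collections import defaultdict


def sarsa_lambda_episode(q, episode, alpha=0.1, gamma=0.97, lam=0.95):
    """Apply tabular Sarsa(lambda) updates for one episode.

    `episode` is a list of (state, action, reward, next_state, next_action, done).
    alpha and gamma are hypothetical values, not taken from the paper.
    """
    traces = defaultdict(float)  # eligibility trace per (state, action)
    for s, a, r, s2, a2, done in episode:
        # The target bootstraps from the action actually taken next, so it
        # depends on the (exploratory) behavior policy.
        target = r if done else r + gamma * q[(s2, a2)]
        delta = target - q[(s, a)]
        traces[(s, a)] += 1.0  # accumulating trace
        for key in list(traces):
            q[key] += alpha * delta * traces[key]
            traces[key] *= gamma * lam  # decay credit for older pairs
    return q


OPTIMISTIC_VALUE = 1.0  # hypothetical optimistic initial estimate
q = defaultdict(lambda: OPTIMISTIC_VALUE)
episode = [
    ("s0", "right", 0.0, "s1", "right", False),
    ("s1", "right", 0.0, "s2", None, True),
]
sarsa_lambda_episode(q, episode)
```

Because the trace for the first state-action pair is still non-zero at the second step, the terminal error also adjusts the earlier estimate, which is what makes the method multi-step.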

experiment, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)

MHSNet: An MoE-based Hierarchical Semantic Representation Network for Accurate Duplicate Resume Detection with Large Language Model

Li, Yu, Chen, Zulong, Xu, Wenjian, Wen, Hong, Yu, Yipeng, Yiu, Man Lung, Yin, Yuyu

arXiv.org Artificial Intelligence, Sep-8-2025

To maintain the company's talent pool, recruiters need to continuously search for resumes from third-party websites (e.g., LinkedIn, Indeed). However, fetched resumes are often incomplete and inaccurate. To improve the quality of third-party resumes and enrich the company's talent pool, it is essential to conduct duplication detection between the fetched resumes and those already in the company's talent pool. Such duplication detection is challenging due to the semantic complexity, structural heterogeneity, and information incompleteness of resume texts. To this end, we propose MHSNet, a multi-level identity verification framework that fine-tunes BGE-M3 using contrastive learning. With the fine-tuned model, Mixture-of-Experts (MoE) generates multi-level sparse and dense representations for resumes, enabling the computation of corresponding multi-level semantic similarities. Moreover, the state-aware MoE is employed in MHSNet to handle diverse incomplete resumes. Experimental results verify the effectiveness of MHSNet.
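The abstract does not state which contrastive objective is used for fine-tuning. As a rough illustration only, the sketch below uses the common InfoNCE loss over cosine similarities. The function names `cosine` and `info_nce`, the temperature, and the toy two-dimensional vectors are assumptions; real inputs would be resume embeddings rather than hand-written lists.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-softmax score of the positive pair.

    The temperature is a hypothetical value chosen for illustration.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


anchor = [1.0, 0.0]
# A duplicate resume (positive) lies close to the anchor; the others do not.
loss_good = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Swapping in a dissimilar "positive" should give a much larger loss.
loss_bad = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimizing this loss pulls embeddings of duplicate resumes together and pushes non-duplicates apart, which is the general idea behind contrastive fine-tuning.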

artificial intelligence, machine learning, natural language, (17 more...)

arXiv:2508.13676

Country: Asia > China > Zhejiang Province (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

4f5aeaee95e528a0ec5040bfa2fe9303-Supplemental-Conference.pdf

Neural Information Processing Systems, Aug-14-2025, 19:01:56 GMT

atari game, expansion, virtual expansion, (12 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hidden Representation Clustering with Multi-Task Representation Learning towards Robust Online Budget Allocation

Wang, Xiaohan, Zhang, Yu, Jiang, Guibin, Cheng, Bing, Lin, Wei

arXiv.org Artificial Intelligence, Jun-3-2025

Marketing optimization, commonly formulated as an online budget allocation problem, has emerged as a pivotal factor in driving user growth. Most existing research addresses this problem by following the principle of 'first predict then optimize' for each individual, which presents challenges related to large-scale counterfactual prediction and solving complexity trade-offs. Note that the practical data quality is uncontrollable, and the solving scale tends to be tens of millions. Therefore, the existing approaches make robust budget allocation non-trivial, especially in industrial scenarios with considerable data noise. To this end, this paper proposes a novel approach that solves the problem from the cluster perspective. Specifically, we propose a multi-task representation network to learn the inherent attributes of individuals and project the original features into high-dimensional hidden representations through the first two layers of the trained network. Then, we divide these hidden representations into $K$ groups through partitioning-based clustering, thus reformulating the problem as an integer stochastic programming problem under different total budgets. Finally, we distill the representation module and clustering model into a multi-category model to facilitate online deployment. Offline experiments validate the effectiveness and superiority of our approach compared to six state-of-the-art marketing optimization algorithms. Online A/B tests on the Meituan platform indicate that the approach outperforms the online algorithm by 0.53% and 0.65%, considering order volume (OV) and gross merchandise volume (GMV), respectively.
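The step of dividing hidden representations into $K$ groups can be sketched with k-means, one common partitioning-based clustering method. The abstract does not name the specific algorithm, so k-means, the farthest-point initialization, the function names, and the toy two-dimensional "hidden representations" below are all assumptions for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def farthest_point_init(points, k):
    """Deterministic initialization: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=20):
    """Plain Lloyd iterations: assign to nearest centroid, then recompute means."""
    centroids = farthest_point_init(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        for j in range(k):
            members = [p for p, c in zip(points, assignment) if c == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Toy stand-ins for hidden representations of six individuals.
hidden = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [5.0, 5.1], [5.1, 5.0], [4.9, 5.0]]
groups, centers = kmeans(hidden, k=2)
```

Each resulting group index could then serve as the unit over which a budget is allocated, instead of solving for every individual separately.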

artificial intelligence, machine learning, optimization, (17 more...)

arXiv:2506.00959

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Revealing Bias Formation in Deep Neural Networks Through the Geometric Mechanisms of Human Visual Decoupling

Ma, Yanbiao, Liu, Bowei, Dai, Wei, Chen, Jiayi, Li, Shuo

arXiv.org Artificial Intelligence, Feb-17-2025

Deep neural networks (DNNs) often exhibit biases toward certain categories during object recognition, even under balanced training data conditions. The intrinsic mechanisms underlying these biases remain unclear. Inspired by the human visual system, which decouples object manifolds through hierarchical processing to achieve object recognition, we propose a geometric analysis framework linking the geometric complexity of class-specific perceptual manifolds in DNNs to model bias. Our findings reveal that differences in geometric complexity can lead to varying recognition capabilities across categories, introducing biases. To support this analysis, we present the Perceptual-Manifold-Geometry library, designed for calculating the geometric properties of perceptual manifolds.

artificial intelligence, machine learning, manifold, (16 more...)

arXiv:2502.11809

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Orthogonal Representation Learning for Estimating Causal Quantities

Melnychuk, Valentyn, Frauen, Dennis, Schweisthal, Jonas, Feuerriegel, Stefan

arXiv.org Artificial Intelligence, Feb-6-2025

Representation learning is widely used for estimating causal quantities (e.g., the conditional average treatment effect) from observational data. While existing representation learning methods have the benefit of allowing for end-to-end learning, they do not have favorable theoretical properties of Neyman-orthogonal learners, such as double robustness and quasi-oracle efficiency. Also, such representation learning methods often employ additional constraints, like balancing, which may even lead to inconsistent estimation. In this paper, we propose a novel class of Neyman-orthogonal learners for causal quantities defined at the representation level, which we call OR-learners. Our OR-learners have several practical advantages: they allow for consistent estimation of causal quantities based on any learned representation, while offering favorable theoretical properties including double robustness and quasi-oracle efficiency. In multiple experiments, we show that, under certain regularity conditions, our OR-learners improve existing representation learning methods and achieve state-of-the-art performance. To the best of our knowledge, our OR-learners are the first work to offer a unified framework of representation learning methods and Neyman-orthogonal learners for causal quantities estimation.
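To make "double robustness" concrete, the sketch below computes the standard doubly robust (AIPW) pseudo-outcome used by the DR-learner, a well-known Neyman-orthogonal learner for the conditional average treatment effect. It is not the OR-learner proposed in the paper. The function name `dr_pseudo_outcome` and the numeric inputs are assumptions, and the nuisance estimates (`mu0`, `mu1`, `propensity`) are taken as given rather than learned.

```python
def dr_pseudo_outcome(y, a, mu0, mu1, propensity):
    """Doubly robust (AIPW) pseudo-outcome for a single unit.

    y          : observed outcome
    a          : binary treatment indicator (0 or 1)
    mu0, mu1   : estimated outcomes under control and treatment
    propensity : estimated probability of treatment, strictly between 0 and 1
    """
    mu_a = mu1 if a == 1 else mu0
    # Inverse-propensity weight: 1/pi for treated units, -1/(1 - pi) for controls.
    weight = (a - propensity) / (propensity * (1.0 - propensity))
    # Plug-in effect estimate plus a weighted residual correction.
    return mu1 - mu0 + weight * (y - mu_a)


# Treated unit whose outcome exceeds the model prediction by 1.0.
treated = dr_pseudo_outcome(y=3.0, a=1, mu0=1.0, mu1=2.0, propensity=0.5)
# Control unit whose outcome matches the model prediction exactly.
control = dr_pseudo_outcome(y=1.0, a=0, mu0=1.0, mu1=2.0, propensity=0.5)
```

Regressing these pseudo-outcomes on covariates (or on a learned representation) gives an effect estimate that remains consistent if either the outcome models or the propensity model is correct.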

artificial intelligence, machine learning, representation, (15 more...)

arXiv:2502.04274

Country:

North America > United States (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Diffusion Auto-regressive Transformer for Effective Self-supervised Time Series Forecasting

Wang, Daoyu, Cheng, Mingyue, Liu, Zhiding, Liu, Qi, Chen, Enhong

arXiv.org Artificial Intelligence, Dec-7-2024

Self-supervised learning has become a popular and effective approach for enhancing time series forecasting, enabling models to learn universal representations from unlabeled data. However, effectively capturing both the global sequence dependence and local detail features within time series data remains challenging. To address this, we propose a novel generative self-supervised method called TimeDART, denoting Diffusion Auto-regressive Transformer for Time series forecasting. In TimeDART, we treat time series patches as basic modeling units. Specifically, we employ a self-attention-based Transformer encoder to model the dependencies between patches. Additionally, we introduce diffusion and denoising mechanisms to capture the fine-grained local features within each patch. Notably, we design a cross-attention-based denoising decoder that allows for adjustable optimization difficulty in the self-supervised task, facilitating more effective self-supervised pre-training. Furthermore, the entire model is optimized in an auto-regressive manner to obtain transferable representations. Extensive experiments demonstrate that TimeDART achieves state-of-the-art fine-tuning performance compared to the most advanced competitive methods in forecasting tasks.

Time series forecasting (Harvey, 1990; Hamilton, 2020; Box et al., 2015; Cheng et al., 2024b) is crucial in a wide array of domains, including finance (Black & Scholes, 1973), healthcare (Cheng et al., 2024c), and energy management (Zhou et al., 2024). Accurate predictions of future data points could enable better decision-making, resource allocation, and risk management, ultimately leading to significant operational improvements and strategic advantages. Among the various methods developed for time series forecasting (Miller et al., 2024), deep neural networks (Ding et al., 2024; Jin et al., 2023; Cao et al., 2023; Cheng et al., 2024b) have emerged as a popular and effective solution paradigm. To further enhance the performance of time series forecasting, self-supervised learning has become an increasingly popular research paradigm (Nie et al., 2022).
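Two ideas from the abstract, treating patches as modeling units and adding diffusion noise within each patch, can be sketched in a few lines. This is only an illustration: it assumes non-overlapping patches and the standard DDPM forward-noising form x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, neither of which is confirmed by the excerpt. The function names, patch length, and noise level are hypothetical, and the Transformer encoder and denoising decoder are omitted.

```python
import math
import random


def make_patches(series, patch_len):
    """Split a series into non-overlapping patches, dropping any remainder."""
    usable = len(series) - len(series) % patch_len
    return [series[i:i + patch_len] for i in range(0, usable, patch_len)]


def add_noise(patch, alpha_bar, rng):
    """Forward diffusion step applied independently to each value in a patch.

    alpha_bar in [0, 1] controls how much of the original signal survives.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in patch]


series = [float(t) for t in range(10)]
patches = make_patches(series, patch_len=4)  # two patches of four values
rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy = [add_noise(p, alpha_bar=0.9, rng=rng) for p in patches]
```

In a pre-training setup along these lines, a decoder would be trained to recover each clean patch from its noisy version, conditioned on representations of the preceding patches.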

artificial intelligence, data mining, machine learning, (15 more...)

arXiv:2410.05711

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
Oceania > New Zealand (0.04)
Oceania > Australia (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry: Energy > Power Industry (0.87)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SeCo-INR: Semantically Conditioned Implicit Neural Representations for Improved Medical Image Super-Resolution

Ekanayake, Mevan, Chen, Zhifeng, Egan, Gary, Harandi, Mehrtash, Chen, Zhaolin

arXiv.org Artificial Intelligence, Sep-2-2024

Implicit Neural Representations (INRs) have recently advanced the field of deep learning due to their ability to learn continuous representations of signals without the need for large training datasets. Although INR methods have been studied for medical image super-resolution, their adaptability to localized priors in medical images has not been extensively explored. Medical images contain rich anatomical divisions that could provide valuable local prior information to enhance the accuracy and robustness of INRs. In this work, we propose a novel framework, referred to as the Semantically Conditioned INR (SeCo-INR), that conditions an INR using local priors from a medical image, enabling accurate model fitting and interpolation capabilities to achieve super-resolution. Our framework learns a continuous representation of the semantic segmentation features of a medical image and utilizes it to derive the optimal INR for each semantic region of the image. We tested our framework using several medical imaging modalities and achieved higher quantitative scores and more realistic super-resolution outputs compared to state-of-the-art methods.

neural representation, representation, seco-inr, (13 more...)

arXiv:2409.01013

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Predicting Polymer Properties Based on Multimodal Multitask Pretraining

Wang, Fanmeng, Guo, Wentao, Cheng, Minjie, Yuan, Shen, Xu, Hongteng, Gao, Zhifeng

arXiv.org Artificial Intelligence, Jun-7-2024

In the past few decades, polymers, high-molecular-weight compounds formed by bonding numerous identical or similar monomers covalently, have played an essential role in various scientific fields. In this context, accurate prediction of their properties is becoming increasingly crucial. Typically, the properties of a polymer, such as plasticity, conductivity, bio-compatibility, and so on, are highly correlated with its 3D structure. However, current methods for predicting polymer properties heavily rely on information from polymer SMILES sequences (P-SMILES strings) while ignoring crucial 3D structural information, leading to sub-optimal performance. In this work, we propose MMPolymer, a novel multimodal multitask pretraining framework incorporating both polymer 1D sequential information and 3D structural information to enhance downstream polymer property prediction tasks. Besides, to overcome the limited availability of polymer 3D data, we further propose the "Star Substitution" strategy to extract 3D structural information effectively. During pretraining, MMPolymer not only predicts masked tokens and recovers 3D coordinates but also achieves the cross-modal alignment of latent representation. Subsequently, we further fine-tune the pretrained MMPolymer for downstream polymer property prediction tasks in the supervised learning paradigm. Experimental results demonstrate that MMPolymer achieves state-of-the-art performance in various polymer property prediction tasks. Moreover, leveraging the pretrained MMPolymer and using only one modality (either P-SMILES string or 3D conformation) during fine-tuning can also surpass existing polymer property prediction methods, highlighting the exceptional capability of MMPolymer in polymer feature extraction and utilization. Our online platform for polymer property prediction is available at https://app.bohrium.dp.tech/mmpolymer.

information, property prediction, representation, (14 more...)

arXiv:2406.04727

Country:

North America > United States > California > Yolo County > Davis (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Government > Regional Government > North America Government (0.46)
Materials > Chemicals > Commodity Chemicals > Petrochemicals > Polymers & Plastics (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Domain Invariant Representation Learning and Sleep Dynamics Modeling for Automatic Sleep Staging

Lee, Seungyeon, Pham, Thai-Hoang, Cheng, Zhao, Zhang, Ping

arXiv.org Artificial Intelligence, Dec-9-2023

Sleep staging has become a critical task in diagnosing and treating sleep disorders to prevent sleep-related diseases. With growing large-scale sleep databases, significant progress has been made toward automatic sleep staging. However, previous studies face critical problems: the heterogeneity of subjects' physiological signals, the inability to extract meaningful information from unlabeled data to improve predictive performance, the difficulty in modeling correlations between sleep stages, and the lack of an effective mechanism to quantify predictive uncertainty. In this study, we propose a neural network-based sleep staging model, DREAM, to learn domain-generalized representations from physiological signals and model sleep dynamics. DREAM learns sleep-related and subject-invariant representations from diverse subjects' sleep signals and models sleep dynamics by capturing interactions between sequential signal segments and between sleep stages. We conducted a comprehensive empirical study to demonstrate the superiority of DREAM, including sleep stage prediction experiments, a case study, the usage of unlabeled data, and uncertainty quantification. Notably, the case study validates DREAM's ability to learn a generalized decision function for new subjects, especially when there are differences between testing and training subjects. Uncertainty quantification shows that DREAM provides prediction uncertainty, making the model reliable and helping sleep experts in real-world applications.

dataset, representation, sleep stage, (13 more...)

arXiv:2312.03196

Country:

North America > United States > Ohio (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Sleep (0.87)
Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
